LexSchem: a Large Subcategorization Lexicon for French Verbs

نویسندگان

  • Cédric Messiant
  • Thierry Poibeau
  • Anna Korhonen
چکیده

This paper presents LexSchem – the first large, fully automatically acquired subcategorization lexicon for French verbs. The lexicon includes subcategorization frame and frequency information for 3297 French verbs. When evaluated on a set of 20 test verbs against a gold standard dictionary, it shows 0.79 precision, 0.55 recall and 0.65 F-measure. We have made this resource freely available to the research community on the web.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Growing TreeLex

TreeLex is a subcategorization lexicon of French, automatically extracted from a syntactically annotated corpus. The lexicon comprises 2006 verbs (25076 occurrences). The goal of the project is to obtain a list of subcategorization frames of contemporary French verbs and to estimate the number of different verb frames available in French in general. A few more frames are discovered when the cor...

متن کامل

Lexical acquisition from corpora: the case of subcategorization frames in French

We present in this paper a method to automatically acquire a syntactic lexicon of subcategorization frames for French verbs directly from large corpora. The method is evaluated against existing lexical resources: we show that our system is capable of producing new frames that were not previously registered. Lastly, we show that it is possible to induce lexico-semantic classes « à la Levin » (19...

متن کامل

LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon

This paper introduces LexFr, a corpus-based French lexical resource built by adapting the framework LexIt, originally developed to describe the combinatorial potential of Italian predicates. As in the original framework, the behavior of a group of target predicates is characterized by a series of syntactic (i.e., subcategorization frames) and semantic (i.e., selectional preferences) statistical...

متن کامل

A Subcategorization Acquisition System for French Verbs

This paper presents a system capable of automatically acquiring subcategorization frames (SCFs) for French verbs from the analysis of large corpora. We applied the system to a large newspaper corpus (consisting of 10 years of the French newspaper ’Le Monde’) and acquired subcategorization information for 3267 verbs. The system learned 286 SCF types for these verbs. From the analysis of 25 repre...

متن کامل

Automatic extraction of subcategorization frames for French

This paper describes the integration of corpus-based syntactic subcategorization frames into a large-scale, theory-neutral lexical resource for French (Romary et al. (2004)). This database is the first to implement the Lexical Markup Framework (LMF), an international initiative towards ISO standards for lexical databases (ISO TC 37/SC 4). The subcategorization frames have been acquired via a de...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008